ABSTRACT

Analyses of peculiar velocity surveys face several challenges, including low 
signal-to-noise in individual velocity measurements and the presence of small- 
scale, nonlinear flows. This is the second in a series of papers in which we describe 
a new method of overcoming these problems by using data compression as a filter 
with which to separate large-scale, linear flows from small-scale noise that can 
bias results. We demonstrate the effectiveness of our method using realistic 
catalogs of galaxy velocities drawn from N-body simulations. Our tests show 
that a likelihood analysis of simulated catalogs that uses all of the information 
contained in the peculiar velocities results in a bias in the estimation of the 
power spectrum shape parameter $\Gamma$ and amplitude $\beta$, and that our method of
analysis effectively removes this bias. We expect that this new method will cause 
peculiar velocity surveys to re-emerge as a useful tool to determine cosmological 
parameters. 

Subject headings: cosmology: distance scales - cosmology: large scale structure of the 
1. INTRODUCTION

Although in principle the measurement of galaxy motions holds great promise as a probe
of large-scale structure, in practice several obstacles have prevented robust conclusions
from being drawn from analyses of these measurements. First, the measurements are
inherently noisy; errors in peculiar velocity determinations are typically of order 10% of
the redshift of a galaxy or cluster, which for all but nearby objects is comparable to or
larger than the velocity being measured. Second, on the theoretical side, in order to relate
the velocity field to the underlying matter density one must assume that the fields are linear.
While this approximation is accurate on large scales, on smaller scales it generally fails due to
infall into density concentrations. The difficulty in surmounting these obstacles is illustrated
by the fact that attempts to compare different velocity field surveys have shown significant
disagreements (Watkins & Feldman 1995; Hudson et al. 1999).

Analyses of catalogs of peculiar velocity measurements have usually taken either of two 
main approaches. One is to average all the velocities to find the bulk flow, the velocity of the 
volume occupied by the survey relative to the Universal rest frame defined by the CMBR 
(e.g. Lauer & Postman (1994); Riess, Press & Kirshner (1995); Branchini, Plionis & Sciama 
(1996); Colless et al. (2001); Aghanim & Górski (2001)). This method has the disadvantage
that it discards most of the information contained in a survey and measures only three 
quantities, the components of the bulk flow vector. The second approach is to use all of 
the information contained in the survey in a likelihood analysis in order to obtain maximum 
likelihood estimates of the power spectrum parameters (e.g. Zaroubi et al. (2001); Yang 
et al. (2001); Zehavi & Dekel (2000); Freudling et al. (1999)). This method potentially 
suffers from a bias due to small-scale, nonlinear contributions to velocities. 

In a recent paper (Watkins et al. (2002), hereafter Paper I), we introduced a new method 
for the analysis of peculiar velocity surveys that is a significant improvement over previous 
methods. In particular, our formalism allows us to separate information about large-scale
flows from information about small scales, the latter of which can then be discarded in the
analysis. By applying specific criteria, we are able to retain the maximum information 
about large scales needed to place the strongest constraints, while removing the bias that 
small scale information can introduce into the results. 

In Paper I we reported on preliminary tests that suggested that our analysis method was
effective. Here we present results from more extensive testing that demonstrate
that our method works as intended. In particular, we show that a likelihood analysis of
simulated catalogs that uses all of the information contained in the peculiar velocities results
in a bias in the estimation of the power spectrum parameters $\Gamma$ and $\beta$, and that our method
of analysis effectively removes this bias.

The paper is organized as follows: in sections 2 and 3 we briefly review the analysis
formalism introduced in Paper I. In section 4 we discuss the N-body simulations we used 
and how we constructed synthetic catalogs. In section 5 we discuss results from our synthetic 
catalogs for both a full likelihood analysis of the catalog as well as our new analysis method. 
We also present additional evidence that our method is working effectively. In section 6 we 
conclude. 

2. The Formalism 

Our starting point is the usual statistical model for the line-of-sight peculiar velocities
of galaxies. First, we assume that galaxies are tracers of a large-scale linear velocity field
$\mathbf{v}(\mathbf{x})$, a Gaussian random field completely described by the velocity power spectrum $P_{(v)}(k)$.
With the assumption of linearity, the velocity power spectrum is proportional to the power
spectrum of density fluctuations, with $P_{(v)}(k) = H^2 \Omega^{1.2} k^{-2} P(k)$.
For a set of $N$ galaxies with positions $\mathbf{r}_i$, the observed line-of-sight velocity will be given
by

$$v_i = \mathbf{v}(\mathbf{r}_i)\cdot\hat{\mathbf{r}}_i + \delta_i , \tag{1}$$

the sum of the radial component of the velocity field with a noise term $\delta_i$ describing both
observational error and any deviation from the linear velocity field due to local gravitational
interactions. For simplicity we assume that $\delta_i$ is distributed as a Gaussian random variable
with variance $\sigma_i^2 + \sigma_*^2$, where $\sigma_i$ is the observational error associated with that particular
galaxy, and $\sigma_*$ describes all other effects and is assumed to be the same for all the galaxies
in the set.

With these assumptions, we can construct the probability distribution for the set of
measured line-of-sight velocities given a power spectrum $P(k)$,

$$L(v_1,\ldots,v_N;P(k)) = \sqrt{|R^{-1}|}\,\exp\left(-\frac{1}{2}\sum_{i,j} v_i R^{-1}_{ij} v_j\right) , \tag{2}$$

where $R_{ij} = \langle v_i v_j \rangle$ is the covariance matrix, which in this case takes the form

$$R_{ij} = R^{(v)}_{ij} + \delta_{ij}(\sigma_i^2 + \sigma_*^2) , \tag{3}$$

where $R^{(v)}_{ij} = \langle \mathbf{v}(\mathbf{r}_i)\cdot\hat{\mathbf{r}}_i \;\, \mathbf{v}(\mathbf{r}_j)\cdot\hat{\mathbf{r}}_j \rangle$ and the second, diagonal term is due to the noise. In linear
theory, the "signal" part of the covariance matrix can be written as an integral over the
density power spectrum

$$R^{(v)}_{ij} = \int P(k)\, W^2_{ij}(k)\, dk , \tag{4}$$

where $W^2_{ij}(k)$ is a tensor window function calculated from the set of positions of the
galaxies with galaxy velocities weighted by their relative error (for more details see Feldman
& Watkins (1994); Watkins & Feldman (1995); Feldman & Watkins (1998)).

Typically we are given a catalog of measured velocities $(v_1,\ldots,v_N)$ and wish to determine
$P(k)$. Thus we can view $L(v_1,\ldots,v_N;P(k))$ as a likelihood functional; given $(v_1,\ldots,v_N)$,
we use Eq. (2) to determine the likelihood that they were generated in a Universe with a
particular $P(k)$.
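As an illustration of how the likelihood of Eq. (2) might be evaluated numerically, the following is a minimal sketch, not the code used for this paper. It assumes the signal covariance has already been computed for a trial power spectrum; the function and argument names (`log_likelihood`, `R_signal`, `sigma_obs`, `sigma_star`) are hypothetical, and the additive constant in $\ln L$ is dropped.

```python
import numpy as np

def log_likelihood(v, R_signal, sigma_obs, sigma_star):
    """ln L for line-of-sight velocities v, up to an additive constant.

    v          : (N,) measured line-of-sight velocities
    R_signal   : (N, N) signal covariance for a trial P(k) (assumed precomputed)
    sigma_obs  : (N,) observational errors, one per galaxy
    sigma_star : scalar, extra dispersion shared by all galaxies
    """
    # Full covariance: signal part plus diagonal noise part.
    R = R_signal + np.diag(sigma_obs**2 + sigma_star**2)
    # The Cholesky factor gives ln|R| and the quadratic form without inverting R.
    L = np.linalg.cholesky(R)
    y = np.linalg.solve(L, v)                   # y = L^{-1} v, so y.y = v^T R^{-1} v
    log_det = 2.0 * np.sum(np.log(np.diag(L)))  # ln|R|
    return -0.5 * log_det - 0.5 * (y @ y)
```

Maximizing this quantity over the parameters that enter `R_signal` would give the maximum likelihood estimates; the Cholesky route is chosen here only for numerical stability.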

An analysis of the type described above is susceptible to biases due to small-scale,
nonlinear contributions to the galaxy velocities. The method of analysis developed in Paper
I eliminates this bias by replacing the full set of $N$ line-of-sight velocities $v_i$ with moments $u_i$,
which are designed so that the $N$th moment is the linear combination that carries the most
information about small scales, the $(N-1)$th moment is an independent linear combination
which carries the second most small-scale information, etc. By using a subset of the first
$N' < N$ moments in our analysis and discarding the rest, we can essentially filter out the
small scale information which may carry nonlinear contributions. Our method is based on
Karhunen-Loève methods of data compression (Kenney & Keeping 1954; Kendall & Stuart
1969; see also Tegmark, Taylor & Heavens 1997), which are designed to concentrate most of the
information in a large set of data into a smaller, more manageable number of moments.
However, our method puts a twist on this idea by concentrating unwanted information
regarding small scales into a small set of moments which can then be discarded. What
follows is a brief review of the mechanics of our method; for details, see Paper I.

Our method is based on linear data compression, so that the moment $u_n$ can be written
in terms of the line-of-sight velocities $v_j$ as

$$u_n = \sum_{j=1}^{N} (b_n)_j v_j , \tag{5}$$

where the $b_n$ are a set of vectors of length $N$. With this definition, the covariance matrix of
the new moments is given as

$$R_{nm} = \langle u_n u_m \rangle = \sum_{i,j} (b_n)_i R_{ij} (b_m)_j . \tag{6}$$

It is convenient to choose the $b_n$ so that the $u_n$ are linearly independent and of unit variance,
so that $R_{nm}$ is the identity matrix. Note that this normalization will hold only for a particular
matrix $R_{ij}$ and hence a particular power spectrum.

In order to find the vector $b_n$ such that the moment $u_n$ carries the maximum information
about nonlinear scales, we assume a simple model for the power spectrum in which the
amount of power on scales below that where density fluctuations have gone nonlinear is
specified by a single parameter $\theta_q$. Given a single moment $u_n$, we can determine the value
of $\theta_q$ to within a minimum variance given by $\Delta\theta_q^2 = 1/F_{qq}$, where $F_{qq}$ is the $qq$th element
of the Fisher information matrix, which in this case, and with the normalization assumed
above, can be shown to take the form

$$F_{qq} = \frac{1}{2}\left[\sum_{i,j} (b_n)_i \frac{\partial R_{ij}}{\partial\theta_q} (b_n)_j\right]^2 . \tag{7}$$

We can thus find the single moment $u_n$ that carries the maximum information about
$\theta_q$ by finding the vector $b_n$ which maximizes $F_{qq}$ subject to the normalization condition
discussed above, which functions as a constraint. After introducing a Lagrange multiplier,
the maximization results in an eigenvalue problem



$$\sum_{j,l,m} L^{-1}_{ij} \frac{\partial R_{jl}}{\partial\theta_q} L^{-1}_{ml} \sum_{p} L_{pm} (b_n)_p = \lambda_n \sum_{j} L_{ji} (b_n)_j , \tag{8}$$

where $L_{ij}$ is the Cholesky decomposition of the covariance matrix, $R_{ij} = \sum_p L_{ip} L_{jp}$.

Solving this eigenvalue problem gives us a set of $N$ orthogonal eigenvectors $\sum_j L_{ji}(b_n)_j$
with eigenvalues $\lambda_n$. Each eigenvector has a corresponding moment $u_n = \sum_j (b_n)_j v_j$. The
eigenvalue $\lambda_n$ of a moment $u_n$ is related to the error bar $\Delta\theta_q$ that one could place on $\theta_q$ using
the single moment $u_n$, as can be seen by manipulating the equations above:

$$F_{qq} = \frac{1}{2}\left[\sum_{i,j} (b_n)_i \frac{\partial R_{ij}}{\partial\theta_q} (b_n)_j\right]^2 = \frac{1}{2}\left[\lambda_n \sum_{i,j} (b_n)_i R_{ij} (b_n)_j\right]^2 = \frac{\lambda_n^2}{2} , \tag{9}$$

so that $\Delta\theta_q^2 = 2/\lambda_n^2$. If we order the moments $u_n$ in order of increasing eigenvalue,

$$|\lambda_1| < |\lambda_2| < \ldots < |\lambda_N| , \tag{10}$$

then we can interpret each moment as carrying successively more information about $\theta_q$, with
$u_N$ carrying the maximum possible amount of information. Since our goal is to produce a
data set that is less sensitive to the value of $\theta_q$ than the original data, we should keep moments
only up to some $N'$. The orthogonality of the eigenvectors ensures that the moments are
statistically independent, as we assumed above. Thus if we compress the data by discarding
the moments with a large value of $|\lambda|$, the information contained in those moments will be
completely removed from the data. However, we would also like to keep as many moments
as possible in order to retain the maximum information about large scales.
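The construction of the moments can be sketched numerically as follows. This is an illustrative implementation under stated assumptions, not the code used for this paper: it assumes the full covariance matrix `R` and its derivative `dR` with respect to the small-scale power parameter are available as dense arrays, and the function name `optimal_moments` is hypothetical. It follows the Cholesky route described above, reducing the constrained problem to an ordinary symmetric eigenvalue problem.

```python
import numpy as np

def optimal_moments(R, dR):
    """Return (lam, B): eigenvalues sorted by |lam| and weight vectors b_n as rows of B.

    R  : (N, N) symmetric positive-definite covariance of the velocities
    dR : (N, N) symmetric derivative of R with respect to the small-scale parameter
    The rows of B satisfy B R B^T = identity (unit-variance, uncorrelated moments).
    """
    L = np.linalg.cholesky(R)              # R = L L^T
    X = np.linalg.solve(L, dR)             # L^{-1} dR
    M = np.linalg.solve(L, X.T).T          # L^{-1} dR L^{-T}
    M = 0.5 * (M + M.T)                    # remove round-off asymmetry
    lam, C = np.linalg.eigh(M)             # columns of C are the vectors L^T b_n
    B = np.linalg.solve(L.T, C).T          # rows are b_n = L^{-T} c_n
    order = np.argsort(np.abs(lam))        # smallest |lam| first
    return lam[order], B[order]
```

Given velocities `v`, the moments would then be `u = B @ v`, and keeping the first `N'` rows of `B` discards the moments most sensitive to small-scale power.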

In order to choose a value of $N'$, we need to examine what error bar $\Delta\theta_q$ we can put on
the parameter $\theta_q$ using the compressed data. Since the moments are independent, we can
write the Fisher matrix for the $N'$ moments that were not discarded as

$$F_{qq} = \frac{1}{2}\sum_{n=1}^{N'} \lambda_n^2 , \tag{11}$$

so that the error bar that can be put on $\theta_q$ using the compressed data is given by

$$\Delta\theta_q = \left(\frac{1}{2}\sum_{n=1}^{N'} \lambda_n^2\right)^{-1/2} . \tag{12}$$

This suggests that $N'$ should be chosen by adding up the squares of the smallest
eigenvalues until the desired sensitivity is reached. The criterion that we use is as follows:
First, we estimate the actual size of the parameter $\theta_q = \theta_{q0}$ from peculiar velocity data.
Then, we keep the largest number $N'$ of moments that is still consistent with the requirement
that $\Delta\theta_q > \theta_{q0}$. With this requirement, as long as our estimate of the true value of $\theta_q$ is
correct, our final set of moments $u_1 \ldots u_{N'}$ will not contain enough information to distinguish
the value of $\theta_q$ from zero.
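The selection rule can be written in a few lines. The sketch below is illustrative only and assumes that the error bar from the first $N'$ moments is $\Delta\theta_q = (\frac{1}{2}\sum_{n \le N'} \lambda_n^2)^{-1/2}$, which follows from adding the single-moment Fisher information of independent moments; the function name `choose_n_keep` and its arguments are hypothetical.

```python
import numpy as np

def choose_n_keep(lam, theta_q0):
    """Largest N' such that the error bar on theta_q still exceeds theta_q0.

    lam      : eigenvalues of the moments (any order)
    theta_q0 : estimated true size of the small-scale power parameter
    """
    lam2 = np.sort(np.asarray(lam, dtype=float) ** 2)  # smallest |lam| first
    fisher = 0.5 * np.cumsum(lam2)                     # F_qq after keeping 1..N' moments
    # Delta theta_q > theta_q0 is equivalent to F_qq < 1 / theta_q0^2.
    # fisher is non-decreasing, so counting the entries below the
    # threshold gives the largest admissible N'.
    return int(np.sum(fisher < 1.0 / theta_q0**2))

n_keep = choose_n_keep([0.1, 0.2, 0.5, 2.0, 5.0], theta_q0=1.0)  # n_keep == 3
```

In the toy call, the first three moments together give $F_{qq} = 0.15$, while adding the fourth raises it to 2.15, so only three are kept.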
3. Other Selection Criteria

Our method of selecting moments by their lack of information about small scales has 
the disadvantage of not discarding moments which have little information about any scale; 
this issue was briefly touched on in Paper I. Thus we have developed a second criterion for 
moment selection whereby we discard moments that are dominated by noise. That is, they 
have no cosmologically useful information. 

Recall from Eq. 3 that the covariance matrix for the line-of-sight velocities is the sum
of a "signal" part and a noise part. Since the covariance matrix for the moments $u_n$ is
essentially a "rotation" of the velocity covariance matrix, this matrix can be separated in a
similar fashion,

$$\begin{aligned} R_{nm} &= \sum_{i,j} (b_n)_i R_{ij} (b_m)_j = \sum_{i,j} (b_n)_i R^{(v)}_{ij} (b_m)_j + \sum_{i,j} (b_n)_i \delta_{ij} (\sigma_i^2 + \sigma_*^2) (b_m)_j \\ &= R^{(v)}_{nm} + \sum_{i} (b_n)_i (b_m)_i (\sigma_i^2 + \sigma_*^2) , \end{aligned} \tag{13}$$

where the second term is the noise contribution to the variance of the moment. Given that
the $b_n$ are normalized such that the moments are independent and have unit variance, i.e.
that $R_{nm}$ is the identity matrix, we see that the quantity

$$\xi_n = \sum_{i} (b_n)_i^2 (\sigma_i^2 + \sigma_*^2) \tag{14}$$

is a measure of the fraction of the variance of the moment $u_n$ that is due to noise. If $\xi_n \ll 1$,
then the moment has very little noise and should be retained. If, on the other hand, $\xi_n \approx 1$,
then the value of the moment is mostly determined by the errors in the data and should be
discarded.
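As a hedged illustration, the noise fraction of each moment can be computed directly from the weight vectors, assuming they are normalized so that each moment has unit variance. The names `noise_fraction` and `low_noise_mask` are hypothetical, and the default threshold is only an example value.

```python
import numpy as np

def noise_fraction(B, sigma_obs, sigma_star):
    """Fraction of each moment's (unit) variance that is due to noise.

    B          : (M, N) array whose rows are the weight vectors b_n
    sigma_obs  : (N,) observational errors
    sigma_star : scalar extra dispersion shared by all galaxies
    """
    noise_var = sigma_obs**2 + sigma_star**2  # diagonal noise covariance
    return (B**2) @ noise_var                 # sum_i (b_n)_i^2 (sigma_i^2 + sigma_*^2)

def low_noise_mask(xi, xi_max=0.95):
    """Boolean mask of moments to retain; moments with xi > xi_max are dropped."""
    return np.asarray(xi) <= xi_max
```

For a catalog with no signal at all, every properly normalized moment would have a noise fraction of exactly one.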

Generally there is a correlation between a moment's $\xi_n$ and its eigenvalue $|\lambda_n|$; moments
that are most sensitive to small scales tend to be very noisy. Low noise moments tend to
be those that probe large-scale power. This is due to the fact that the measurement errors
in the velocities, which vary independently from galaxy to galaxy, are much more effective
at masking small scale modes than large scale modes. However, we have found that some
moments with small eigenvalues also have large noise; these moments carry little information
about any scale.

The correlation between $\xi_n$ and $|\lambda_n|$ suggests that our two selection criteria can be
somewhat redundant; eliminating noisy moments often can also accomplish the goal of removing
moments with large $|\lambda_n|$. Similarly, eliminating moments with large $|\lambda_n|$ leaves one
with moments which generally have smaller noise. Thus we will see below that applying the
second criterion after we have already applied the first typically does not change the results
of the analysis significantly.

Once one has a set of moments which have both small $\xi_n$ and $|\lambda_n|$, it is desirable to
have a way of determining which scales each moment is most sensitive to. From Eq. 4 we
recall that the "signal" part of the covariance matrix for the velocities is given by an
integral of a tensor window function $W^2_{ij}(k)$ with the power spectrum. The "signal" part
of the covariance matrix $R_{nm}$ of the moments is given by a "rotation" of this tensor window
function. Since $R_{nm}$ is diagonal, this results in a scalar window function for each moment $u_n$,

$$W^2_n(k) = \sum_{i,j} (b_n)_i W^2_{ij}(k) (b_n)_j . \tag{15}$$

By examining the window functions for each moment, we can determine which scales the
moments are sensitive to and confirm that our method is working. In principle, examination
of the window functions could also provide a further criterion for the discarding of moments.
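The scalar window function of each moment is a quadratic form in the weight vector at every wavenumber. A minimal sketch follows, assuming the tensor window function has been tabulated on a grid of wavenumbers as an array `W` of shape (K, N, N); that layout and the function name `moment_windows` are assumptions made for illustration.

```python
import numpy as np

def moment_windows(B, W):
    """Scalar window function of each moment on a wavenumber grid.

    B : (M, N) array whose rows are the weight vectors b_n
    W : (K, N, N) tensor window function tabulated at K wavenumbers
    Returns an (M, K) array with entry [n, k] = sum_ij (b_n)_i W[k, i, j] (b_n)_j.
    """
    return np.einsum('ni,kij,nj->nk', B, W, B)
```

Plotting each row against the wavenumber grid would show which scales a given moment probes.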

4. Synthetic Catalogs 

The simulations used here are numerical models for the gravitational dynamics of collisionless
particles in an expanding background. We are studying evolution of initial Gaussian
perturbations in a matter-dominated universe. All the simulations are done with a particle-mesh
(PM) code with $256^3$ particles in an equal number of grid points (Melott 1986; Melott,
Weinberg & Gott 1988). More details about the peculiarities of the simulations used here
can be found in Melott & Shandarin (1993). Although the parameter $\Omega$ appears in both the
dynamics of the expanding background and as a parameter in the fit to the power spectrum
shape in the CDM family of models (Bardeen et al. 1986), it serves two different functions.

We ran simulations with a dynamical background $\Omega = 1$ or 0.34. These were normalized
to an amplitude $\sigma_8 = 0.93$ at redshift $z = 0$, but we also took data at $z = 1$,
which for our purposes can be described as studying a Universe with a lower perturbation
amplitude normalization, and possibly a higher $\Omega$. All models were interpreted with an
assumed Hubble constant $H = 100h$ km s$^{-1}$ Mpc$^{-1}$ where $h = 2/3$. When power spectra are
parameterized in Mpc, the shape is dependent upon $\Omega h^2$, which we set equal to 0.15, 0.35,
and 1.0 for our low-$\Omega$ tests. We have used some values which are inconsistent with other
constraints in CDM linear theory in order to test our method over a wide range of values.
We also ran another simulation with $\Omega = 1$ and $\Omega h^2 = 0.15$; such models have been called
$\tau$CDM in the past. The set with $\Omega = 0.34$, $\Omega h^2 = 0.15$ is most consistent with a variety of
findings at this time, but we do not wish to test our method only against currently favored
cosmologies. There are a variety of alternative models in addition to the cosmological constant
$\Lambda$; all have very small and totally linear effects on large-scale velocities. We omit these
in favor of a wider exploration of parameter shifts which have large effects.

For testing our method we created synthetic redshift-distance catalogs from the $256^3$
$N$-body PM simulations. In these simulations, the box size was taken to equal 512 Mpc, or
34,133 km s$^{-1}$ in redshift space for $h = 2/3$.

Each of the points defined by the mesh represented a galaxy with a corresponding
location and velocity. Testing the optimal moments method accurately requires proper
modeling of statistical errors, so we wanted to be sure to include the effects of cosmic variance
and scatter in distance indicators. We chose three coordinates at 1/6, 1/2 and 5/6 of the
box width. Taking every combination of these coordinates along the three axes results in 27 center
locations within the box, each corresponding to an individual synthetic catalog as described
next. About each central location, an annular volume was defined by the redshift range 500
km s$^{-1}$ $< z <$ 10,000 km s$^{-1}$. If the volume intercepted the boundaries of the box then
we appealed to the periodic boundary conditions of the simulation and included galaxies
from the opposite box side. Within each of the defined regions, we selected 1000 galaxies
under the assumption of a radial selection function and the additional requirement that
there is a zone of avoidance at galactic latitudes $|b| < 10°$. The selection function
was chosen generically (we ignored galactic properties) to mimic existing popular redshift-distance
surveys (e.g. the SFI survey; da Costa et al. 1996). The effect of scatter in
distance indicators was replicated by adding a random error to each peculiar velocity drawn
from a Gaussian distribution of width 10% of the galaxy redshift distance.
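The geometric part of this construction can be sketched in a few lines. The code below is illustrative rather than the pipeline used for this paper: the function names are hypothetical, the selection function and zone of avoidance are omitted, and the "width" of the Gaussian error is assumed to mean its standard deviation.

```python
import itertools
import numpy as np

# Box side in km/s: 512 Mpc with H = 100 h km/s/Mpc and h = 2/3.
BOX = 512.0 * 100.0 * (2.0 / 3.0)

def catalog_centers(box=BOX):
    """The 27 catalog centers: every combination of 1/6, 1/2, 5/6 of the box side."""
    fractions = (1 / 6, 1 / 2, 5 / 6)
    return [tuple(f * box for f in combo)
            for combo in itertools.product(fractions, repeat=3)]

def periodic_offset(pos, center, box=BOX):
    """Displacement from center to pos using the nearest periodic image."""
    d = np.asarray(pos, dtype=float) - np.asarray(center, dtype=float)
    return d - box * np.round(d / box)

def perturb_velocities(v_los, cz, frac=0.10, rng=None):
    """Add Gaussian errors with standard deviation frac * cz to each velocity."""
    rng = np.random.default_rng() if rng is None else rng
    return np.asarray(v_los, dtype=float) + rng.normal(0.0, frac * np.asarray(cz, dtype=float))
```

Using the nearest periodic image is what allows a survey volume that crosses a box face to pick up galaxies from the opposite side.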

In the end, for each simulation box we have 27 surveys sampling the simulation that 
give information about the positions and radial velocities of the galaxy distribution in some 
volume. We analyzed these surveys by using the actual positions (i.e. no errors) and by 
perturbing the velocities with a 10% Gaussian error. The value of sampling the simulation 
in this way is that we are able to model the effects of cosmic variance. The errorbars of 
the unperturbed catalogs are predominantly from the cosmic variance, whereas those of the 
perturbed, 10% catalogs include the effects of both cosmic variance and the inaccuracies of 
the distance indicators. 



5. Results 



The main purpose of the formalism we presented here and in Paper I was to allow the
removal or filtering of small scale noise while keeping the large scale signal. To test the
success of the formalism we have created synthetic surveys from simulations with known
parameters, specifically $\Gamma$, the CDM power spectrum shape parameter, and $\beta$, its amplitude
(the values of the parameters we simulated are given in Table 1 below). To compare our
method with the full analysis method, we reemphasize that the optimal moment analysis
presented here allows for two semi-independent methods of cleaning up a survey: 1) ordering
the moments by their eigenvalues (Eq. 10); and 2) removing the noisiest moments (Eq. 14).
In Figs. 1 and 3-6 we show the comparison between choosing the modes least susceptible to
small scale signal (open triangles); those that are least susceptible to small scale signal and
are not noisy (open squares); and the full analysis (that is, the usual analysis keeping all
moments, solid triangles). We see that the full analysis fails to recover the "true" parameters
by a significant amount ($\sim 4\sigma$ for no errors and $> 2\sigma$ for 10% errors). In contrast, the mode
analysis recovers the values of the parameters very well, with or without the removal of the
noisy moments.

For each one of the models we simulated we have extracted 27 catalogs from each
simulation box, as described in Sec. 4. The points and error bars in the figures are the
mean and standard deviation of the mean of the maximum likelihood values over all catalogs.
Each one of the catalogs was analyzed using the full maximum likelihood analysis, keeping
all moments; the maximum likelihood analysis discarding large eigenvalue modes; and the
maximum likelihood mode analysis without the noisiest moments ($\xi > 0.95$).
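For concreteness, the summary statistic plotted for each model can be computed as below. This is a minimal sketch with a hypothetical function name, assuming the standard deviation of the mean is the sample standard deviation divided by the square root of the number of catalogs.

```python
import numpy as np

def mean_and_sem(estimates):
    """Mean and standard deviation of the mean of per-catalog estimates."""
    x = np.asarray(estimates, dtype=float)
    return x.mean(), x.std(ddof=1) / np.sqrt(x.size)
```

The input would be the list of maximum likelihood values of one parameter, one entry per catalog.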

In Fig. 2 we show the value of the estimated parameters as a function of $\sum\lambda^2$ (see
Eq. 12), where we see that as the number of modes is increased, we get closer and closer to
the "true" value. When we keep more than the number of moments that corresponds to the
fulfillment of our criterion (Eq. 12), the values start diverging from the "true" results. This
is due to the fact that small-scale modes that have become nonlinear are introducing a bias.
This tendency of the full analysis to systematically overestimate the parameter values can
be seen in the analyses done for simulations with various cosmological parameters, Figs. 1
and 3-6.

Table 1: The values of the parameters for the simulations used to extract the catalogs we
used to test our formalism. $\Gamma_t$ and $\beta_t$ are the "true" values of our parameters for each
simulation. The Analysis column denotes our proposed formalism (mode) and the full maximum
likelihood analysis (full). The $\langle\Gamma\rangle$ and $\langle\beta\rangle$ columns show the mean and the standard
deviation of the mean of the parameters for each simulation. The $\sigma_\Gamma$ and $\sigma_\beta$ columns are the
standard deviations for the parameters. When comparing the maximum likelihood values,
the mode analysis does a much better job of recovering the "true" values than does the full
analysis.

We have experimented with the choice of $N'$, the number of modes to keep, as discussed
in the text after Eq. 12. This choice depends on our power spectrum and more specifically
on $k_{nl}$, the wavenumber of the largest scale for which density perturbations have become
nonlinear (see Paper I). We chose $k_{nl}$ by comparing the linear power spectrum from the
initial conditions of the simulation to the power spectrum at the end of the run; $k_{nl}$ is where
the power spectra started to diverge. In general we found that $k_{nl} \approx 0.2$, though choosing
$0.15 < k_{nl} < 0.4$ did not affect our results significantly. Further, as can be seen in Fig. 2,
the choice of $N'$ need not be finely tuned.

As was discussed in the text, the reason for the failure of the full analysis to recover the "true"
parameters when the mode analysis succeeds so well can be shown by looking at the window
functions themselves. In Fig. 7 we show the window functions corresponding to the five
lowest eigenvalues and lowest noise (lower left panel). Clearly, these probe only large scales.
As we move up the panels we see the window functions with larger noise components not
removed, whereas when we move to the right we see window functions corresponding to
larger eigenvalues. Here the reasons for the particular choices for our criteria Eqs. (10) and
(12) become clear. As the eigenvalues or the noise level become large, the window functions
generally probe more small scale and fewer large scale modes. Since we are primarily
interested in large scale information, discarding the noisy, high-$\lambda$ modes allows us to discard
small-scale signal that might interfere with our analysis.

One more advantage the formalism provides is efficiency. For a catalog of ~ 1,000 
galaxies it takes the mode analysis about one hour CPU time, whereas it takes the full 
analysis about seven hours to complete. The differences are more dramatic for larger surveys: 
A 5,000 galaxy catalog completes in about 30 hours with the mode analysis and about 1,300 
hours of CPU time for the full analysis. All runs were done on the Origin2000 at the NCSA, 
University of Illinois, Urbana-Champaign. 

As was mentioned in Sec. 3, there is a general correlation between the $n$th
moment's eigenvalue $\lambda_n$ and the noise $\xi_n$ associated with it (dots in Fig. 8). As can be seen
in the figure, the correlation is not perfect, that is, there are low eigenvalues with a large noise
component, but in general as we move to larger eigenvalues (large $n$) the noise component is
larger. The line in the figure represents the running mean of the noise, which shows clearly
the correlation between the noise and the eigenvalue.
In Fig. 9 we show the contours that contain 68% and 94% of the total likelihood for six
typical catalogs. The diamond shows the maximum likelihood results, whereas the asterisk in
each panel shows the "true" values of the parameters. These contours allow us to estimate 
the uncertainty in the maximum likelihood values obtained from the analysis of a single 
catalog, as is the case when analyzing observational data. From the figures it is clear that 
the uncertainties obtained in this way are comparable to those we get from the Monte-Carlo 
simulations. In general, when we try to test the reliability of results from an observational 
data set, we apply our formalism to mock catalogs extracted from N-body simulations as 
was done here. This compatibility between the uncertainties obtained in two different ways 
gives us confidence that using the likelihood contours will give us an accurate assessment 
of the uncertainties of our maximum likelihood values when we apply our method to real 
catalogs. 



6. Conclusions 

We have described the power and elegance of a new statistic that was designed and 
formulated in order to address a crisis in the analysis of proper distance cosmological surveys. 
We have shown that our formalism mostly overcomes the problems with the traditional 
analysis of the data. Whereas the full maximum likelihood analysis tends to overestimate 
the values of the parameters that describe the power distribution on large scales, our mode
analysis makes very accurate estimates of these parameters. 

The formalism presented here assumes Gaussian statistics. The natural question should 
be: Can the deviations from Gaussianity caused by the collapse of perturbations interfere 
with the removal of small scale power and introduce additional unpredictable biases? As 
the results in Sec. 5 indicate, deviations from Gaussianity do not have a measurable effect 
and the effectiveness of filtering small-scale power is unbiased. Further, we have explored 
in detail such issues as moment and noise selection, the window functions' effectiveness and
criteria for which modes to keep. 

As was shown in Paper I and in more detail here, the formalism we presented is highly 
adaptive and versatile. It can be applied to surveys with any geometry and density, and since
it retains maximum information should be particularly useful for sparse data such as that 
obtained in cluster peculiar velocity surveys. Overall, we consider this method to be a 
significant improvement over previous methods used for the analysis of peculiar velocity 
data. 
Foundation under grant number AST-0070702, the University of Kansas General Research
Hebrew University, Jerusalem, Israel and by the Institute of Theoretical Physics at the
Fig. 1: A comparison between the mode analysis presented in this paper and the traditional
full analysis. The results are from 81 catalogs extracted from the simulations (see Sec. 4).
In the top two panels are the mean values and standard deviations of the mean of $\beta$, the
amplitude of the power spectrum. In the bottom panels we show the results for estimating
$\Gamma$, the shape parameter. In the left panels we have the results for the analysis for a survey
with no errors whereas the right panels show the results for 10% errors. The solid symbols
are the full analysis results and the empty ones are the mode analysis. The triangles are the
results without removing the noisy moments, the rectangles are those where we removed the
noisiest moments. The horizontal lines are the "true" values of the parameters ($\Gamma = 0.225$,
$\beta = 0.455$). It is clear that the full analysis fails to recover the parameter values whereas
the mode analysis recovers them well.
Fig. 2: The mean value of the estimated parameters from 81 catalogs extracted from the
simulations (see Sec. 4) as a function of the number of modes we keep. The top panel shows
results for a survey with no errors, the bottom panel shows the results with distance errors
of 10%. It is clear that as the number of modes kept increases beyond the criterion set, the
estimators become more and more biased. The horizontal lines are the "true" values of the
parameters ($\Gamma = 0.225$, $\beta = 0.455$).
Fig. 3: The same as Fig. 1 with different cosmological parameters ($\Gamma = 0.255$, $\beta = 0.372$).
Fig. 4: The same as Fig. 1 with different cosmological parameters ($\Gamma = 0.515$, $\beta = 0.325$).
Fig. 5: The same as Fig. 1 with different cosmological parameters ($\Gamma = 0.51$, $\beta = 0.41$).
Fig. 6: The same as Fig. 1 with different cosmological parameters ($\Gamma = 0.667$, $\beta = 0.32$).
Fig. 7: The window functions, from top to bottom, corresponding to noise in the ranges
$0.98 < \xi$, $0.95 < \xi < 0.98$, $0.9 < \xi < 0.95$ and $\xi < 0.9$ respectively, and, across from left
to right, to low, medium and high eigenvalues $\lambda$ respectively. We can clearly see that the low eigenvalue, low
noise window functions (lower left panel) probe large scales (small $k$), whereas higher noise,
larger eigenvalue window functions (up and to the right) probe smaller scales.
Further, the high noise window functions probe scales that are hard to model, as do those
with large eigenvalues.
Fig. 8: The noise $\xi$ (Eq. 14) as a function of the rank $n$ of the eigenvalue $\lambda_n$ (Eq. 12)
(shown as dots). On average there is an excellent correlation between the rank and the
noise component.
Fig. 9: Maximum likelihood contours from six typical mock catalogs. The contours are
the 68% and 94% likelihood lines. This shows the expected uncertainties in the analysis of
a single catalog. In most cases the uncertainties in the estimated values of the parameters $\Gamma$
and $\beta$ are of comparable size to the Monte Carlo error bars presented in Figs. 1-6.



